Engineering a Simplified 0-Bit Consistent Weighted Sampling
Abstract
The Min-Hashing approach to sketching has become an important tool in data analysis, search, and classification. To apply it to real-valued datasets, the ICWS algorithm has become a seminal approach that is widely used and provides state-of-the-art performance for this problem space. However, ICWS suffers from a computational burden as the sketch size K increases. We develop a new Simplified approach to the ICWS algorithm that enables us to obtain over 20x speedups compared to the standard algorithm. The veracity of our approach is demonstrated empirically on multiple datasets, showing that our new Simplified CWS obtains the same quality of results while being an order of magnitude faster.
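For orientation, the following is a minimal sketch of the standard ICWS baseline that the abstract refers to, written from the commonly cited formulation of the algorithm and not from the paper's own code. It does not implement the Simplified variant, whose details the abstract does not give. The function names (`icws_sketch`, `generalized_jaccard`, `collision_rate_full`, `collision_rate_0bit`) and the dense `(K, D)` random arrays are assumptions made for illustration. The inner work is proportional to K times the number of nonzero features, which is the cost that grows with the sketch size K.

```python
import numpy as np

def icws_sketch(x, K, seed=0):
    """Standard ICWS sketch of a non-negative vector x (illustrative only).

    Returns (i_star, t_star), each of length K. The 0-bit variant keeps
    only i_star and discards t_star.
    """
    x = np.asarray(x, dtype=float)
    D = x.shape[0]
    # The same seed must be used for every vector so the hashes are consistent.
    rng = np.random.default_rng(seed)
    r = rng.gamma(2.0, 1.0, size=(K, D))
    c = rng.gamma(2.0, 1.0, size=(K, D))
    beta = rng.uniform(0.0, 1.0, size=(K, D))

    nz = np.flatnonzero(x > 0)              # only nonzero features can be sampled
    log_x = np.log(x[nz])
    t = np.floor(log_x / r[:, nz] + beta[:, nz])
    y = np.exp(r[:, nz] * (t - beta[:, nz]))
    a = c[:, nz] / (y * np.exp(r[:, nz]))

    j = np.argmin(a, axis=1)                # winning feature for each of the K hashes
    i_star = nz[j]
    t_star = t[np.arange(K), j].astype(np.int64)
    return i_star, t_star

def generalized_jaccard(x, y):
    """Weighted (min/max) Jaccard similarity that ICWS collisions estimate."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.minimum(x, y).sum() / np.maximum(x, y).sum())

def collision_rate_full(sx, sy):
    """Fraction of hashes where both the index and t agree."""
    return float(np.mean((sx[0] == sy[0]) & (sx[1] == sy[1])))

def collision_rate_0bit(sx, sy):
    """Fraction of hashes where the index alone agrees (0-bit CWS)."""
    return float(np.mean(sx[0] == sy[0]))
```

As a usage note, calling `icws_sketch` on two vectors with the same `K` and `seed` and comparing them with `collision_rate_full` gives an estimate of `generalized_jaccard`. `collision_rate_0bit` is the cheaper approximation that drops `t_star`, and it can overestimate similarity when the number of features is small.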